A semi-deterministic ensemble strategy for imbalanced datasets (SDEID) applied to bankruptcy prediction
Authors
Abstract
In the last decade, the availability and use of credit for Brazilian companies grew rapidly. Until recently, the decision to grant credit was based on human judgment of the risk of insolvency. Increased demand from companies for credit has led to the use of more accurate models for bankruptcy prediction. In recent years, model development has advanced considerably, fostered by increased competition among financial institutions, changes in the economic environment for businesses, and advances in computational techniques. This article discusses some of the main problems in building bankruptcy prediction models with data mining techniques and presents alternatives for them. The first problem is class imbalance, which may cause poor classification performance; it is treated jointly with an ensemble strategy. The second is the selection of the most significant combination of attributes, namely the financial variables, which have been widely studied in insolvency prediction. Finally, a case study on a real-world database of Brazilian companies is presented.
Similar resources
Ensembles of Local Linear Models for Bankruptcy Analysis and Prediction
Bankruptcy prediction is an extensively researched topic, and ensemble methodology has also been applied to it. However, the interpretability of the results, so often important in practical applications, has not been emphasized. This paper builds ensembles of locally linear models using a forward variable selection technique. The method applied to four datasets provides information about the import...
An improved boosting based on feature selection for corporate bankruptcy prediction
With the recent financial crisis and European debt crisis, corporate bankruptcy prediction has become an increasingly important issue for financial institutions. Many statistical and intelligent methods have been proposed; however, no overall best method has been used in predicting corporate bankruptcy. Recent studies suggest ensemble learning methods may have potential applicability i...
A Genetic Algorithm-Based Heterogeneous Random Subspace Ensemble Model for Bankruptcy Prediction
Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble techniques are known to be very useful in improving the generalization ability of a classifier. The random subspace ensemble technique is a simple but effective method of constructing ensemble classifiers, in which some features are randomly d...
An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring
In this paper, we investigate the performance of several systems based on ensembles of classifiers for bankruptcy prediction and credit scoring. The results are very encouraging: they improve on the performance obtained using the stand-alone classifiers. We show that the "Random Subspace" method outperforms the other ensemble methods tested in this paper. Moreover, the best stand-...
Machine learning algorithms in air quality modeling
Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to these issues. It is a fact that input variable type largely affec...